Extraction of pragmatic and semantic salience from spontaneous spoken English

Authors

  • Tong Zhang
  • Mark Hasegawa-Johnson
  • Stephen E. Levinson
Abstract

This paper computationalizes two linguistic concepts, contrast and focus, for the extraction of pragmatic and semantic salience from spontaneous speech. Contrast and focus have been widely investigated in modern linguistics as categories that link intonation and information/discourse structure. This paper demonstrates the automatic tagging of contrast and focus for the purpose of robust spontaneous speech understanding in a tutorial dialogue system. In particular, we propose two new transcription tasks, and demonstrate automatic replication of human labels in both tasks. First, we define the focus kernel to represent those words that contain novel information neither presupposed by the interlocutor nor contained in the preceding words of the utterance. We propose detecting the focus kernel based on a word dissimilarity measure, part-of-speech tagging, and prosodic measurements including duration, pitch, energy, and our proposed spectral balance cepstral coefficients. In order to measure the word dissimilarity, we test a linear combination of ontological and statistical dissimilarity measures previously published in the computational linguistics literature. Second, we propose identifying symmetric contrast, which consists of a set of words that are parallel or symmetric in linguistic structure but distinct or contrastive in meaning. The symmetric contrast identification is performed in a way similar to the focus kernel detection. The effectiveness of the proposed extraction of symmetric contrast and focus kernel was tested on a Wizard-of-Oz corpus collected in the tutoring dialogue scenario. The corpus consists of 630 non-single word/phrase utterances, containing approximately 5700 words and 48 min of speech. The tests used speech waveforms together with manual orthographic transcriptions, and yielded an accuracy of 83.8% for focus kernel detection and 92.8% for symmetric contrast detection. Our tests also demonstrated that the spectral balance cepstral coefficients, the semantic dissimilarity measure, and part-of-speech played important roles in the symmetric contrast and focus kernel detections.

© 2005 Published by Elsevier B.V.
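The abstract mentions a linear combination of ontological and statistical dissimilarity measures but does not give the weights or the specific measures. The following is a minimal sketch of what such a combination could look like, not the authors' implementation. The function names, the weight `alpha`, the assumption that both scores lie in [0, 1], and the idea of scoring a word's novelty as its smallest dissimilarity to any preceding word are all hypothetical choices made for illustration.

```python
from typing import Callable, Sequence


def combined_dissimilarity(ontological: float, statistical: float,
                           alpha: float = 0.5) -> float:
    """Linearly combine two dissimilarity scores.

    `alpha` is a hypothetical mixing weight; the abstract does not state one.
    Both inputs are assumed to be normalized to [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * ontological + (1.0 - alpha) * statistical


def novelty(word: str, preceding: Sequence[str],
            dissim: Callable[[str, str], float]) -> float:
    """Illustrative novelty score: smallest dissimilarity between `word`
    and any preceding word. A word with no preceding context is treated
    as maximally novel (1.0). This scoring rule is an assumption."""
    if not preceding:
        return 1.0
    return min(dissim(word, p) for p in preceding)


if __name__ == "__main__":
    # Toy dissimilarity: 0 for identical words, 1 otherwise.
    toy = lambda a, b: 0.0 if a == b else 1.0
    print(combined_dissimilarity(0.2, 0.8, alpha=0.5))
    print(novelty("force", ["the", "force"], toy))
```

In a detector like the one the abstract describes, a score of this kind would be one feature among several, alongside part-of-speech tags and prosodic measurements.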

Similar resources

Cross–linguistic Comparison of Refusal Speech Act: Evidence from Trilingual EFL Learners in English, Farsi, and Kurdish

To date, little research on pragmatic transfer has considered a multilingual situation where there is an interaction among three different languages spoken by one person. Of interest was whether pragmatic transfer of refusals among three languages spoken by the same person occurs from L1 and L2 to L3, L1 to L2 and then to L3 or from L1 and L1 (if there are more than one L1) to L2. This study ai...

Pragmatic expressions in cross-linguistic perspective

This paper focuses on some pragmatic expressions that are characteristic of informal spoken English, their possible equivalents in some other languages, and their use by EFL learners from different backgrounds. These expressions, called general extenders (e.g. and stuff, or something), are shown to be different from discourse markers and to exhibit variation in form, funct...

Cross-Cultural Differences and Pragmatic Transfer in English and Persian Refusals

This study aimed to examine cross-cultural differences in performing refusal of requests between Persian native speakers (PNSs) and English native speakers (ENSs) in terms of the frequency of the semantic formulas. Also examined in this study was whether Persian EFL learners would transfer their L1 refusal patterns into the L2, and if there would be a relation between their proficiency level an...

Speech-like Pragmatic Markers in Argumentative Essays Written by Iranian EFL Students and Native English Speaking Students

In this study, the use of speech-like pragmatic markers in Iranian EFL students’ academic writing was investigated. Speech-like pragmatic markers, such as I think, well, I guess, actually, anyway, anyhow, etc. are linguistic components that are more specific to conversation than writing, and writers may wrongly include them in their academic writing. To examine the students’ use of speech-like ...

Journal title:
  • Speech Communication

Volume 48  Issue 

Pages  -

Publication date 2006